Shijian Wang

MMSkills: Towards Multimodal Skills for General Visual Agents

May 14, 2026
Via arXiv

AgentDisCo: Towards Disentanglement and Collaboration in Open-ended Deep Research Agents

May 12, 2026
Via arXiv

Optimal Transport for LLM Reward Modeling from Noisy Preference

May 07, 2026
Via arXiv

ClawMark: A Living-World Benchmark for Multi-Turn, Multi-Day, Multimodal Coworker Agents

Apr 26, 2026
Via arXiv

MuSEAgent: A Multimodal Reasoning Agent with Stateful Experiences

Mar 29, 2026
Via arXiv

GlyphBanana: Advancing Precise Text Rendering Through Agentic Workflows

Mar 12, 2026
Via arXiv

OmniGAIA: Towards Native Omni-Modal AI Agents

Feb 26, 2026
Via arXiv

Agent2World: Learning to Generate Symbolic World Models via Adaptive Multi-Agent Feedback

Dec 26, 2025
Via arXiv

Attributed Synthetic Data Generation for Zero-shot Domain-specific Image Classification

Apr 06, 2025
Via arXiv

Command A: An Enterprise-Ready Large Language Model

Apr 01, 2025
Via arXiv